restaurant process
Spatial distance dependent Chinese restaurant processes for image segmentation
The distance dependent Chinese restaurant process (ddCRP) was recently introduced to accommodate random partitions of non-exchangeable data [1]. The dd-CRP clusters data in a biased way: each data point is more likely to be clustered with other data that are near it in an external sense. This paper examines the dd-CRP in a spatial setting with the goal of natural image segmentation. We explore the biases of the spatial ddCRP model and propose a novel hierarchical extension better suited for producing "human-like" segmentations. We then study the sensitivity of the models to various distance and appearance hyperparameters, and provide the first rigorous comparison of nonparametric Bayesian models in the image segmentation domain. On unsupervised image segmentation, we demonstrate that similar performance to existing nonparametric Bayesian models is possible with substantially simpler models and algorithms.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Comparing Nonparametric Bayesian Tree Priors for Clonal Reconstruction of Tumors
Deshwar, Amit G., Vembu, Shankar, Morris, Quaid
Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular to infer clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, to infer the phylogeny and genotype of major subclonal lineages represented in the population of cancer cells. We also propose new split-merge updates tailored to the subclonal reconstruction problem that improve the mixing time of Markov chains. In comparisons with the tree-structured stick breaking prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor.
- North America > Canada > Ontario > Toronto (0.15)
- Asia > Middle East > Jordan (0.04)
- Health & Medicine > Therapeutic Area > Oncology > Leukemia (1.00)
- Health & Medicine > Therapeutic Area > Hematology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Musculoskeletal (0.94)
Transfer Topic Modeling with Ease and Scalability
Kang, Jeon-Hyung, Ma, Jun, Liu, Yan
The increasing volume of short texts generated on social media sites, such as Twitter or Facebook, creates a great demand for effective and efficient topic modeling approaches. While latent Dirichlet allocation (LDA) can be applied, it is not optimal due to its weakness in handling short texts with fast-changing topics and scalability concerns. In this paper, we propose a transfer learning approach that utilizes abundant labeled documents from other domains (such as Yahoo! News or Wikipedia) to improve topic modeling, with better model fitting and result interpretation. Specifically, we develop Transfer Hierarchical LDA (thLDA) model, which incorporates the label information from other domains via informative priors. In addition, we develop a parallel implementation of our model for large-scale applications. We demonstrate the effectiveness of our thLDA model on both a microblogging dataset and standard text collections including AP and RCV1 datasets.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Asia > Middle East > Jordan (0.05)
- South America > Chile (0.04)
- (8 more...)
- Government > Regional Government > North America Government > United States Government (0.68)
- Government > Military (0.46)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.89)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Spatial distance dependent Chinese restaurant processes for image segmentation
Ghosh, Soumya, Ungureanu, Andrei B., Sudderth, Erik B., Blei, David M.
The distance dependent Chinese restaurant process (ddCRP) was recently introduced toaccommodate random partitions of non-exchangeable data [1]. The dd-CRP clusters data in a biased way: each data point is more likely to be clustered with other data that are near it in an external sense. This paper examines the dd-CRP in a spatial setting with the goal of natural image segmentation. We explore the biases of the spatial ddCRP model and propose a novel hierarchical extension bettersuited for producing "humanlike" segmentations. We then study the sensitivity of the models to various distance and appearance hyperparameters, and provide the first rigorous comparison of nonparametric Bayesian models in the image segmentationdomain. On unsupervised image segmentation, we demonstrate that similar performance to existing nonparametric Bayesian models is possible with substantially simpler models and algorithms.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Hierarchical Topic Models and the Nested Chinese Restaurant Process
Griffiths, Thomas L., Jordan, Michael I., Tenenbaum, Joshua B., Blei, David M.
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting--which of the large collection of possible trees to use? We take a Bayesian approach, generating anappropriate prior via a distribution on partitions that we refer to as the nested Chinese restaurant process. This nonparametric prior allows arbitrarilylarge branching factors and readily accommodates growing data collections. We build a hierarchical topic model by combining this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation. We illustrate our approach on simulated data and with an application to the modeling of NIPS abstracts.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > California > Alameda County > Berkeley (0.14)
- Asia > Middle East > Jordan (0.05)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.91)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Hierarchical Topic Models and the Nested Chinese Restaurant Process
Griffiths, Thomas L., Jordan, Michael I., Tenenbaum, Joshua B., Blei, David M.
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting--which of the large collection of possible trees to use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested Chinese restaurant process. This nonparametric prior allows arbitrarily large branching factors and readily accommodates growing data collections. We build a hierarchical topic model by combining this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation. We illustrate our approach on simulated data and with an application to the modeling of NIPS abstracts.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > California > Alameda County > Berkeley (0.14)
- Asia > Middle East > Jordan (0.05)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.91)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Hierarchical Topic Models and the Nested Chinese Restaurant Process
Griffiths, Thomas L., Jordan, Michael I., Tenenbaum, Joshua B., Blei, David M.
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting--which of the large collection of possible trees to use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested Chinese restaurant process. This nonparametric prior allows arbitrarily large branching factors and readily accommodates growing data collections. We build a hierarchical topic model by combining this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation. We illustrate our approach on simulated data and with an application to the modeling of NIPS abstracts.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > California > Alameda County > Berkeley (0.14)
- Asia > Middle East > Jordan (0.05)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.91)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)